The data set contains 1344 US ETFs. As shown below, each ETF has the variables Date, Open, High, Low, Close, Volume and OpenInt. I added the variable Price for every date as the mean of that day's High and Low.
## Date Open High Low Close Volume OpenInt Price
## 1: 2010-07-21 24.333 24.333 23.946 23.946 43321 0 24.1395
## 2: 2010-07-22 24.644 24.644 24.362 24.487 18031 0 24.5030
## 3: 2010-07-23 24.759 24.759 24.314 24.507 8897 0 24.5365
## 4: 2010-07-26 24.624 24.624 24.449 24.595 19443 0 24.5365
## 5: 2010-07-27 24.477 24.517 24.431 24.517 8456 0 24.4740
## 6: 2010-07-28 24.477 24.517 24.352 24.431 4967 0 24.4345
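As a minimal sketch of how the Price column arises, the following base-R snippet recomputes it for the first two rows printed above. The data frame `etf_head` is a hypothetical stand-in for one ETF table; only High and Low are needed.

```r
# Hypothetical stand-in for the first two rows of the table above
etf_head <- data.frame(
  Date = as.Date(c("2010-07-21", "2010-07-22")),
  High = c(24.333, 24.644),
  Low  = c(23.946, 24.362)
)

# Price is the midpoint between the daily High and Low
etf_head$Price <- (etf_head$High + etf_head$Low) / 2
etf_head$Price
```

The two values agree with the Price column of the printed rows (24.1395 and 24.5030).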
The following calculations have been made on all ETFs with 1000 or more data points. This is the case for 1092 ETFs.
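The selection step could look roughly like the sketch below. The names `etfs_all` and `etfs_kept` and the toy tables are assumptions for illustration; only the threshold of 1000 rows comes from the text.

```r
# Hypothetical named list with one table per ETF (toy data)
etfs_all <- list(
  long.us  = data.frame(Price = runif(1200, 10, 20)),
  short.us = data.frame(Price = runif(300, 10, 20))
)

# Keep only the ETFs with 1000 or more observations
etfs_kept <- Filter(function(x) nrow(x) >= 1000, etfs_all)
names(etfs_kept)
```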
I calculated the overall return simply as the relative growth of the ETF's Price from the first observation to the last.
overall_return <- function(x){
o.return <- (x[which.max(Date)]$Price - x[which.min(Date)]$Price)/x[which.min(Date)]$Price
o.return
}
overall_returns <- sapply(etfs.red, function(x) overall_return(x))
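To make the definition concrete, here is a small worked example on a plain numeric vector. The prices are made up, and the vector is assumed to be ordered by date.

```r
# Made-up prices, ordered from the first to the last observation
price <- c(20, 22, 21, 25)

first_price <- price[1]
last_price  <- price[length(price)]

# Relative growth from the first to the last observation
overall <- (last_price - first_price) / first_price
overall  # 0.25, i.e. the price grew by 25 %
```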
“In finance, volatility is the degree of variation of a trading price series over time.” Wiki
A very common measure of volatility is the Historic Volatility (\(=HV\)).
It is defined as: \[\begin{aligned}
HV &= sd(R) \\
R &= \ln\left(\frac{V_f}{V_i}\right)
\end{aligned}\]
\(R =\) logarithmic return
\(V_i =\) price at market close on day \(i\) (I used the mean price of day \(i\) instead)
\(V_f =\) price at market close on the following day (I used the mean price of the following day instead)
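#Logarithmic (continuously compounded) return: ln(price today / price on the previous trading day)
#Rows are assumed to be ordered by Date; the first row has no previous day and becomes NA
get_log_return <- function(x){
  x[, log_return := log(Price / shift(Price))]
}
#Absolute return: price today minus price on the previous trading day
get_abs_return <- function(x){
  x[, abs_return := Price - shift(Price)]
}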
etfs.red <- lapply(etfs.red, get_log_return)
etfs.red <- lapply(etfs.red, get_abs_return)
#Historic Volatility
hist_vol <- sapply(etfs.red, function(x) sd(x$log_return[-1]))
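The calculation can be reproduced on a toy series in base R. This is only a sketch with made-up prices; `diff(log(price))` is one common way to obtain the log return of each day relative to the day before.

```r
# Made-up daily prices, ordered by date
price <- c(100, 102, 101, 105, 103)

# Log return of each day relative to the previous day: ln(P_t / P_(t-1))
log_return <- diff(log(price))

# Historic volatility: standard deviation of the log returns
hv <- sd(log_return)
hv
```

Because the standard deviation ignores the sign of the returns, HV is the same whether the ratio inside the logarithm is taken forwards or backwards in time.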
The fractal dimension can be thought of as a measure of roughness for a given geometric object (including curves). Various methods have been suggested to calculate the fractal dimension of financial price charts, and some of them get quite complicated. I chose a rather simple one by John Ehlers (Original publication). The formula to calculate the fractal dimension \(D\) is as follows:
\[\begin{aligned} D &= \frac{Log(HL1 + HL2) - Log(HL)}{Log(2)} \\ HL1 &= \frac{Max(High, \frac{1}{2}N..N)-Min(Low,\frac{1}{2}N..N)}{\frac{1}{2}N} \\ HL2 &= \frac{Max(High, \frac{1}{2}N)-Min(Low,\frac{1}{2}N)}{\frac{1}{2}N} \\ HL &= \frac{(Max(High,N) - Min(Low,N))}{N} \end{aligned}\]
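The fractal dimension can be thought of as a measure of volatility for a financial price chart. In the code below I use the daily Price in place of High and Low, and \(N\) is the number of observations of the ETF.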
#Function to calculate the fractal dimension as introduced by John Ehlers
fractal_dimension <- function(data){
data.half <- nrow(data)/2
first.half <- 1:round(data.half)
data1 <- data[first.half]
data2 <- data[!first.half]
hl1 <- (max(data1$Price) - min(data1$Price)) / data.half
hl2 <- (max(data2$Price) - min(data2$Price)) / data.half
hl <- (max(data$Price) - min(data$Price)) / nrow(data)
D <- (log(hl1 + hl2) - log(hl)) / log(2)
D
}
#Get for all ETFs the fractal dimension 'D'
fractal_dimensions <- sapply(etfs.red, function(x) fractal_dimension(x))
#First look at fractal dimensions
#table(round(fractal_dimensions, 1))
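To get a feeling for the scale of \(D\), the sketch below applies the same formula to two made-up series: a straight trend line and pure noise around a constant level. `fd_sketch` is a hypothetical helper that works on a plain numeric vector and, like the function above, uses one price per day instead of High and Low.

```r
# Hypothetical helper: Ehlers-style fractal dimension of a numeric price vector
fd_sketch <- function(price) {
  n    <- length(price)
  half <- n %/% 2
  first  <- price[1:half]
  second <- price[(half + 1):n]
  # Price range per observation in each half and over the whole series
  hl1 <- (max(first)  - min(first))  / half
  hl2 <- (max(second) - min(second)) / (n - half)
  hl  <- (max(price)  - min(price))  / n
  (log(hl1 + hl2) - log(hl)) / log(2)
}

set.seed(1)
trend <- seq(10, 20, length.out = 200)   # smooth line
noise <- 15 + rnorm(200, sd = 0.5)       # no trend, only fluctuation

fd_sketch(trend)  # close to 1
fd_sketch(noise)  # clearly larger, at most 2
```

The smooth line ends up near the lower end of the scale and the noisy series near the upper end, which matches the reading of \(D\) as a roughness measure.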
“In finance, the beta (\(\beta\) or beta coefficient) of an investment indicates whether the investment is more or less volatile than the market as a whole.” Wiki
As a reference for the whole market I chose the S&P500 index. I collected the data from Yahoo-Finance.
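sp500 <- fread("sp500.csv", sep = ",")
#Daily price as the mean of High and Low, as for the ETFs
sp500[, Price := (as.numeric(High) + as.numeric(Low)) / 2]
sp500 <- sp500[, c("Date", "Price")]
sp500[, Date := as.Date(Date)]
#Assign the result so that the new column is kept
sp500 <- get_log_return(sp500)
#The first day has no previous price, so its return is set to 0
sp500[1, log_return := 0]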
The beta is then defined as: \[\beta = \frac{\mathrm{Cov}(r_a, r_b)}{\mathrm{Var}(r_b)}\]
Where:
\(r_a =\) Log return of ETF
\(r_b =\) Log return of S&P500
get_beta <- function(x, market = sp500){
  #Drop the first row of each series, which has no valid return,
  #and merge on Date so that only overlapping days are compared
  a <- merge(x[-1], market[-1], by = "Date")
  #Covariance and market variance are both taken over the same overlapping days
  cov(a$log_return.x, a$log_return.y) / var(a$log_return.y)
}
betas <- sapply(etfs.red, get_beta)
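The definition can be checked on simulated returns where the true beta is known. Everything in this sketch is made up: `market` stands in for the S&P 500 log returns, and `beta_sketch` is a hypothetical one-line helper.

```r
# Hypothetical helper implementing beta = Cov(r_a, r_b) / Var(r_b)
beta_sketch <- function(r_a, r_b) cov(r_a, r_b) / var(r_b)

set.seed(42)
market   <- rnorm(500, mean = 0, sd = 0.01)          # simulated market returns
etf_2x   <- 2 * market                               # moves exactly twice as much
etf_half <- 0.5 * market + rnorm(500, sd = 0.002)    # moves half as much, plus noise

beta_sketch(etf_2x, market)    # 2 up to floating-point error
beta_sketch(etf_half, market)  # close to 0.5
```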
The plot below shows the distributions of the volatility measures for all 1092 ETFs used.
volatility_measures <- data.table(etf = names(etfs.red), hist_vol, betas, sharpe_ratios, fractal_dimensions, overall_returns)
volatility_measures_hist <- melt(volatility_measures[, -c("etf", "overall_returns")], variable.name = "measure")
levels(volatility_measures_hist$measure) <- c("Historic Volatilities", "Betas", "Sharpe Ratios", "Fractal Dimensions")
ggplot(volatility_measures_hist, aes(value)) +
geom_histogram(bins = 20, col = "black", fill = "lightgrey") +
facet_wrap(~measure, scales = "free") +
labs(title = "Distribution of volatility measurements") +
theme_minimal()
| Measure | Value | Interpretation |
|---|---|---|
| Historic Volatility | small | Price moves slowly |
| | big | Price moves fast |
| Beta | < 1 | Price moves slower than the market |
| | 1 | Price moves with the market |
| | > 1 | Price moves faster than the market |
| Sharpe Ratio | small | High risk, low return |
| | big | Low risk, high return |
| Fractal Dimension | 1 | Simple line (0 volatility) |
| | 2 | Area (maximal volatility) |
As a first step in comparing the volatility measures I looked at the correlations between these measures. In addition, I tested whether the volatility measures correlate with the overall returns of the ETFs. As can be seen in the pairs plot below, there are no strong correlations between the volatility measures. The strongest correlation is between the historic volatility and the Sharpe ratio. This is not surprising, since I defined 0 as the risk-free rate. The Sharpe ratio is therefore the mean log return divided by the historic volatility.
pairs.panels(volatility_measures[, -c("etf")])
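The relation between the Sharpe ratio and the historic volatility mentioned above can be written out as follows. This is the textbook form of the ratio with the risk-free rate \(R_f\) set to 0; the symbols \(S\) and \(\bar{R}\) (the mean log return) are notation introduced here for illustration.

\[\begin{aligned}
S &= \frac{\bar{R} - R_f}{sd(R)} \\
  &= \frac{\bar{R}}{HV} \quad \text{for } R_f = 0
\end{aligned}\]

For a fixed mean return, a larger HV directly lowers the Sharpe ratio, which would explain why these two measures are more strongly related than the others.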
Another way of visualising the relation between the volatility measures and the overall returns is shown below. The ETFs with strong overall returns seem to be related in some way to the volatility measures. This could be an interesting starting point to determine a value for any ETF based on its volatility measures.
g_legend <- function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)}
legend <- ggplot(volatility_measures,
aes(hist_vol, betas, size = overall_returns, color = overall_returns)) +
geom_point() +
scale_color_continuous(low = "lightgrey", high = "black") +
guides(color= guide_legend(title="Log[Overall Return]"), size=guide_legend(title="Log[Overall Return]"))
legend <- g_legend(legend)
p1 <- ggplot(volatility_measures,
aes(hist_vol, betas, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_colour_continuous(low = "lightgrey", high = "black") +
xlab("HV") +
ylab("Beta") +
theme_minimal()
p2 <- ggplot(volatility_measures,
aes(hist_vol, sharpe_ratios_abs, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_color_continuous(low = "lightgrey", high = "black") +
xlab("HV") +
ylab("Sharpe Ratio") +
theme_minimal()
p3 <- ggplot(volatility_measures,
aes(hist_vol, fractal_dimensions, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_color_continuous(low = "lightgrey", high = "black") +
xlab("HV") +
ylab("D") +
theme_minimal()
p4 <- ggplot(volatility_measures,
aes(betas, sharpe_ratios_abs, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_color_continuous(low = "lightgrey", high = "black") +
xlab("Beta") +
ylab("Sharpe Ratio") +
theme_minimal()
p5 <- ggplot(volatility_measures,
aes(betas, fractal_dimensions, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_color_continuous(low = "lightgrey", high = "black") +
xlab("Beta") +
ylab("D") +
theme_minimal()
p6 <- ggplot(volatility_measures,
aes(sharpe_ratios_abs, fractal_dimensions, size = overall_returns, color = overall_returns)) +
geom_point(alpha = 0.3, show.legend = FALSE) +
scale_color_continuous(low = "lightgrey", high = "black") +
xlab("Sharpe Ratio") +
ylab("D") +
theme_minimal()
layout <- rbind(c(1,NA,7),c(2,4,NA),c(3,5,6))
grid.arrange(p1, p2, p3, p4, p5, p6, legend,
layout_matrix = layout,
top = "Volatility Measures of >1000 ETFs")
In the following plots I compare the ETFs with the lowest and highest values for each of the calculated volatility measures. In addition to the price chart I plotted the daily price fluctuation, which is simply the change of the Price from one day to the next.
#Daily price change: price today minus price on the previous trading day (NA for the first day)
price_fluc <- function(price.chart){
  price.chart[, price_change := Price - shift(Price)]
}
plot_price_chart <- function(etfs, variable, variable.name = "", max = TRUE){
require(magrittr)
require(plotly)
require(data.table)
if(max == TRUE){
max.chart <- etfs[[which.max(variable)]]
max_var <- variable[which.max(variable)]
max.chart <- price_fluc(max.chart)
p <- max.chart %>% plot_ly(x = ~Date, y = ~Price) %>% add_lines()
p1 <- max.chart %>%
plot_ly(x = ~Date, y = ~price_change, type = "bar") %>%
layout(yaxis = list(title = "Price Change"))
subplot(p, p1, nrows = 2, shareX = T, titleY = T) %>%
layout(title=paste("Price chart of ",
names(max_var) ,
" (",
variable.name,
" = ",
round(max_var, 4), ")",
sep = ""), showlegend = F)
} else {
min.chart <- etfs[[which.min(variable)]]
min_var <- variable[which.min(variable)]
min.chart <- price_fluc(min.chart)
p <- min.chart %>% plot_ly(x = ~Date, y = ~Price) %>% add_lines()
p1 <- min.chart %>%
plot_ly(x = ~Date, y = ~price_change, type = "bar") %>%
layout(yaxis = list(title = "Price Change"))
subplot(p, p1, nrows = 2, shareX = T, titleY = T) %>%
layout(title=paste("Price chart of ",
names(min_var) ,
" (",
variable.name,
" = ",
round(min_var, 4), ")",
sep = ""), showlegend = F)
}}
Note how the fxo.us ETF shows a huge price increase on 29 September 2008, followed by an immediate decrease. This fluctuation is probably a result of the big market drops on that day.
CNBC on the 29 September 2008: “Investors are stunned and dump stocks frantically until the Dow ends 777 points lower, at 10365.45, its biggest one-day point drop ever. The S&P 500 also logs its biggest one-day point drop, falling 106.59, or 8.8 percent, to 1106.42. The Nasdaq has its biggest one-day point decline since 2000, falling 199.61, or 9.1 percent, to 1983.73.” Link
plot_price_chart(etfs.red, hist_vol, "HV", max = T)
plot_price_chart(etfs.red, hist_vol, "HV", max = F)
In the two following plots it is very striking that on 18 August 2016 there is a huge drop compared to all other fluctuations. CNBC on the ‘flash crash’ of 25 August 2015: “There were 1,278 trading halts for 471 different ETFs and stocks. Because of this, it was not possible to calculate the value of many ETFs, or hedge or trade ETFs and stocks at a ‘correct’ price.” Link
plot_price_chart(etfs.red, fractal_dimensions, "D", max = T)
plot_price_chart(etfs.red, fractal_dimensions, "D", max = F)
plot_price_chart(etfs.red, betas, "Beta", max = T)
plot_price_chart(etfs.red, betas, "Beta", max = F)